Mutual benefits: Combining reinforcement learning with sequential sampling models
Authors
Abstract
Similar resources
Combining Stochastic Task Models with Reinforcement Learning for Dynamic Scheduling
We view dynamic scheduling as a sequential decision problem. Firstly, we introduce a generalized planning operator, the stochastic task model (STM), which predicts the effects of executing a particular task on state, time and reward using a general procedural format (pure stochastic function). Secondly, we show that effective planning under uncertainty can be obtained by combining adaptive hori...
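Per the abstract, an STM is a pure stochastic function from a state and a task to a predicted next state, elapsed time, and reward. A minimal Python sketch of that interface follows; every name and distribution here (stm_step, mean_duration, p_success, and so on) is an illustrative assumption, not taken from the paper.

    import random

    def stm_step(state, task, rng):
        # Sketch of an STM as a pure stochastic function: sample the effect
        # of executing `task` on state, elapsed time, and reward. Passing
        # the rng explicitly keeps the function pure given its inputs.
        duration = max(rng.gauss(task["mean_duration"], task["sd_duration"]), 0.0)
        if rng.random() < task["p_success"]:
            next_state = dict(state, done=state["done"] + [task["name"]])
            reward = task["reward"]
        else:
            next_state = dict(state)   # a failed task leaves the state unchanged
            reward = 0.0
        return next_state, duration, reward

    rng = random.Random(0)
    task = {"name": "t1", "mean_duration": 2.0, "sd_duration": 0.5,
            "p_success": 0.9, "reward": 1.0}
    print(stm_step({"done": []}, task, rng))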
Sequential Sampling Plan with Fuzzy Parameters
In this paper, a new sequential sampling plan is introduced in which the acceptable quality level (AQL) and the lot tolerance percent defective (LTPD) are fuzzy numbers. The plan is well defined, since if the parameters are crisp it reduces to the classical plan. For such a plan, a particular table of rejection and acceptance is calculated and compared with the classical one. Keywords : St...
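The abstract notes that with crisp parameters the fuzzy plan reduces to the classical sequential sampling plan. As a point of reference, here is a sketch of that classical item-by-item plan, with acceptance and rejection lines derived from Wald's SPRT; the fuzzy generalization of AQL and LTPD is not reproduced, and the risk values below are illustrative.

    import math

    def sequential_plan(p1, p2, alpha, beta):
        # Wald-style acceptance/rejection lines for an item-by-item plan:
        # accept if d <= -h1 + s*n, reject if d >= h2 + s*n, else continue.
        g1 = math.log(p2 / p1)
        g2 = math.log((1 - p1) / (1 - p2))
        s = g2 / (g1 + g2)                                # common slope
        h1 = math.log((1 - alpha) / beta) / (g1 + g2)     # acceptance intercept
        h2 = math.log((1 - beta) / alpha) / (g1 + g2)     # rejection intercept
        def decide(n, d):
            # n items inspected so far, d of them defective
            if d <= -h1 + s * n:
                return "accept"
            if d >= h2 + s * n:
                return "reject"
            return "continue"
        return decide

    decide = sequential_plan(p1=0.01, p2=0.05, alpha=0.05, beta=0.10)
    for n, d in [(10, 0), (60, 1), (60, 5)]:
        print(n, d, decide(n, d))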
Combining Reinforcement Learning with Symbolic Planning
One of the major difficulties in applying Q-learning to real-world domains is the sharp increase in the number of learning steps required to converge towards an optimal policy as the size of the state space increases. In this paper we propose a method, PLANQ-learning, that couples a Q-learner with a STRIPS planner. The planner shapes the reward function, and thus guides the Q-learner quickly ...
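The abstract says the STRIPS planner shapes the Q-learner's reward. One standard way to realize planner-guided shaping is potential-based shaping, sketched below on a toy 1-D chain; this is an assumed reading, not necessarily PLANQ-learning's exact coupling, and plan_steps_to_goal is a hypothetical stand-in for the planner.

    import random

    GOAL = 9

    def plan_steps_to_goal(s):
        # Hypothetical planner estimate: remaining steps on a 1-D chain.
        return GOAL - s

    def phi(s):
        # Potential: progress according to the planner's plan length.
        return -plan_steps_to_goal(s)

    alpha, gamma, eps = 0.5, 0.95, 0.1
    actions = (-1, 1)
    Q = {(s, a): 0.0 for s in range(GOAL + 1) for a in actions}
    rng = random.Random(1)

    for episode in range(200):
        s = 0
        while s != GOAL:
            # Epsilon-greedy action selection.
            if rng.random() < eps:
                a = rng.choice(actions)
            else:
                a = max(actions, key=lambda a: Q[(s, a)])
            s2 = min(max(s + a, 0), GOAL)
            r = 1.0 if s2 == GOAL else 0.0
            # Potential-based shaping: F(s, s') = gamma * phi(s') - phi(s).
            shaped = r + gamma * phi(s2) - phi(s)
            if s2 == GOAL:
                target = shaped
            else:
                target = shaped + gamma * max(Q[(s2, b)] for b in actions)
            Q[(s, a)] += alpha * (target - Q[(s, a)])
            s = s2

    print("greedy action at s=0:", max(actions, key=lambda a: Q[(0, a)]))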
Importance Sampling for Reinforcement Learning with Multiple Objectives
This thesis considers three complications that arise from applying reinforcement learning to a real-world application. In the process of using reinforcement learning to build an adaptive electronic market-maker, we find the sparsity of data, the partial observability of the domain, and the multiple objectives of the agent to cause serious problems for existing reinforcement learning algorithms....
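Importance sampling, the tool named in the title, lets returns gathered under one (behavior) policy estimate the value of another (target) policy by reweighting each sample by the ratio of action probabilities. A minimal sketch on a two-action bandit, with all numbers illustrative:

    import random

    rng = random.Random(0)
    b  = {0: 0.5, 1: 0.5}     # behavior policy: collects the data
    pi = {0: 0.2, 1: 0.8}     # target policy we want to evaluate
    true_mean = {0: 0.0, 1: 1.0}

    n, total = 10000, 0.0
    for _ in range(n):
        a = 0 if rng.random() < b[0] else 1
        reward = rng.gauss(true_mean[a], 1.0)
        # Reweight the sampled reward by pi(a)/b(a).
        total += (pi[a] / b[a]) * reward
    print("importance-sampling estimate:", total / n)   # expect about 0.8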
Reinforcement Learning with Polynomial Learning Rate in Parameterized Models
We consider reinforcement learning in a parameterized setup, where the model is known to belong to a finite set of Markov Decision Processes (MDPs) under the discounted return criterion. We propose an on-line algorithm for learning in such parameterized models, the Parameter Elimination (PEL) algorithm, and analyze its performance in terms of the total number of mistakes. The algorithm relies on Wald’s s...
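Although the abstract is cut off, PEL is said to rely on Wald's sequential testing to prune candidate models. The sketch below shows the general elimination idea on a deliberately simplified family (Bernoulli parameters standing in for full MDPs); the threshold and structure are assumptions, not the paper's algorithm.

    import math, random

    rng = random.Random(2)
    # Candidate "models": Bernoulli parameters standing in for full MDPs.
    loglik = {0.2: 0.0, 0.5: 0.0, 0.8: 0.0}
    true_p = 0.8
    threshold = math.log(100)      # Wald-style elimination threshold
    t = 0
    while len(loglik) > 1:
        t += 1
        x = 1 if rng.random() < true_p else 0
        for p in loglik:
            loglik[p] += math.log(p if x else 1 - p)
        best = max(loglik.values())
        # Eliminate any candidate whose likelihood falls too far behind.
        loglik = {p: ll for p, ll in loglik.items() if best - ll < threshold}
    print("surviving parameter:", next(iter(loglik)), "after", t, "samples")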
Journal
Journal title: Neuropsychologia
Year: 2020
ISSN: 0028-3932
DOI: 10.1016/j.neuropsychologia.2019.107261